Expert Constrained Clustering: A Symbolic Approach

Authors

  • Fabrice Rossi
  • Frédérick Vautrain
Abstract

A new constrained model is discussed as a way of efficiently incorporating a priori expert knowledge into the clustering of a given set of individuals. The first innovation is the combination of fusion constraints, which require some individuals to belong to the same cluster, with exclusion constraints, which separate some individuals into different clusters. This combination makes it necessary to check that a solution exists (i.e., that no pair of individuals is connected by both fusion and exclusion constraints). The second novelty is that the constraints are expressed in a symbolic language that allows a compact description of groups of individuals according to a given interpretation. This paper studies the coherence of such constraints at the individual and symbolic levels. A mathematical framework, close to Symbolic Data Analysis [3], is built in order to define how a symbolic description space may be interpreted on a given individual set. A partial order on symbolic descriptions (a usual assumption in Artificial Intelligence) allows a symbolic analysis of the constraints. Our results provide not only an individual clustering but also a symbolic one.
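The existence check mentioned in the abstract can be illustrated at the individual level with a minimal sketch. It is not the authors' algorithm: it assumes that fusion constraints are given as pairs of individuals and are treated as transitive, groups them with a union-find structure, and then rejects any exclusion pair whose two members end up in the same fusion group. The names `find` and `constraints_are_coherent` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, Hashable, Iterable, Tuple

Pair = Tuple[Hashable, Hashable]


def find(parent: Dict[Hashable, Hashable], x: Hashable) -> Hashable:
    """Return the representative of x's fusion group (with path halving)."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def constraints_are_coherent(fusion: Iterable[Pair],
                             exclusion: Iterable[Pair]) -> bool:
    """Assumed coherence test: no exclusion pair lies inside one fusion group."""
    parent: Dict[Hashable, Hashable] = {}
    # Merge individuals linked by fusion constraints (assumed transitive).
    for a, b in fusion:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[ra] = rb
    # A solution exists only if every excluded pair is in distinct groups.
    return all(find(parent, a) != find(parent, b) for a, b in exclusion)
```

For example, with fusion pairs `("a", "b")` and `("b", "c")`, an exclusion pair `("a", "c")` makes the constraint set incoherent, whereas `("a", "d")` does not.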


Similar resources

Normalizing Constrained Symbolic Data for Clustering

Clustering is one of the most common operations in data analysis, while constrained clustering is less common. We present here a clustering method in the framework of Symbolic Data Analysis (S.D.A.) which makes it possible to cluster symbolic data. Such data can be constrained by relations between the variables, expressed by rules which encode the domain knowledge. But such rules can induce a combinatorial increase of ...


Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and the social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into existing clustering algorithms. One of the well-researched constrained clustering algorithms is called microaggregation. In a microaggreg...


Symbolic Exposition of Medical Data-Sets: A Data Mining Workbench to Inductively Derive Data-Defining Symbolic Rules

The application of data mining techniques upon medical data is certainly beneficial for researchers interested in discerning the complexity of healthcare processes in real-life operational situations. In this paper we present a methodology, together with its computational implementation, for the automated extraction of data-defining CNF symbolic rules from medical data-sets comprising both anno...


Constrained Co-clustering of Gene Expression Data

In many applications, the expert interpretation of co-clustering is easier than that of mono-dimensional clustering. Co-clustering aims at computing a bi-partition, that is, a collection of co-clusters: each co-cluster is a group of objects associated with a group of attributes, and these associations can support interpretations. Many constrained clustering algorithms have been proposed to exploit the do...


Image Segmentation: Type–2 Fuzzy Possibilistic C-Mean Clustering Approach

Image segmentation is an essential issue in image description and classification. Currently, in many real applications, segmentation is still mainly manual or strongly supervised by a human expert, which makes it irreproducible and prone to degradation. Moreover, images contain many uncertainties and much vagueness, which crisp clustering and even Type-1 fuzzy clustering cannot handle. Hence, Type-...



Publication date: 2000